随着对图像重建任务的深度神经网络(DNN)的兴趣,其可靠性已受到质疑(Antun等,2020; Gottschling等,2020)。然而,最近的工作表明,与总变化(TV)最小化相比,它们与对抗性噪声相似,以$ \ ell^2 $ - 重构误差(Genzel等,2022)。我们考虑使用$ \ ell^\ infty $ -norm的不同鲁棒性概念,并认为本地化的重建工件比$ \ ell^2 $ -Error更相关。我们创造了对不足采样的MRI测量值的对抗性扰动,这些测量值在电视调查重建中诱导严重的局部伪影。相同的攻击方法对基于DNN的重建不如有效。最后,我们表明,这种现象是可以保证精确恢复的重建方法固有的,就像用$ \ ell^1 $或TV-最小化的压缩传感重建一样。
translated by 谷歌翻译
在这项工作中,我们考虑线性逆问题$ y = ax + \ epsilon $,其中$ a \ colon x \ to y $是可分离的hilbert spaces $ x $和$ y $之间的已知线性运算符,$ x $。 $ x $和$ \ epsilon $中的随机变量是$ y $的零平均随机过程。该设置涵盖成像中的几个逆问题,包括去噪,去束和X射线层析造影。在古典正规框架内,我们专注于正则化功能的情况下未能先验,而是从数据中学习。我们的第一个结果是关于均方误差的最佳广义Tikhonov规则器的表征。我们发现它完全独立于前向操作员$ a $,并仅取决于$ x $的平均值和协方差。然后,我们考虑从两个不同框架中设置的有限训练中学习常规程序的问题:一个监督,根据$ x $和$ y $的样本,只有一个无人监督,只基于$ x $的样本。在这两种情况下,我们证明了泛化界限,在X $和$ \ epsilon $的分发的一些弱假设下,包括子高斯变量的情况。我们的界限保持在无限尺寸的空间中,从而表明更精细和更细的离散化不会使这个学习问题更加困难。结果通过数值模拟验证。
translated by 谷歌翻译
Recent years have seen a proliferation of research on adversarial machine learning. Numerous papers demonstrate powerful algorithmic attacks against a wide variety of machine learning (ML) models, and numerous other papers propose defenses that can withstand most attacks. However, abundant real-world evidence suggests that actual attackers use simple tactics to subvert ML-driven systems, and as a result security practitioners have not prioritized adversarial ML defenses. Motivated by the apparent gap between researchers and practitioners, this position paper aims to bridge the two domains. We first present three real-world case studies from which we can glean practical insights unknown or neglected in research. Next we analyze all adversarial ML papers recently published in top security conferences, highlighting positive trends and blind spots. Finally, we state positions on precise and cost-driven threat modeling, collaboration between industry and academia, and reproducible research. We believe that our positions, if adopted, will increase the real-world impact of future endeavours in adversarial ML, bringing both researchers and practitioners closer to their shared goal of improving the security of ML systems.
translated by 谷歌翻译
Although machine learning based algorithms have been extensively used for detecting phishing websites, there has been relatively little work on how adversaries may attack such "phishing detectors" (PDs for short). In this paper, we propose a set of Gray-Box attacks on PDs that an adversary may use which vary depending on the knowledge that he has about the PD. We show that these attacks severely degrade the effectiveness of several existing PDs. We then propose the concept of operation chains that iteratively map an original set of features to a new set of features and develop the "Protective Operation Chain" (POC for short) algorithm. POC leverages the combination of random feature selection and feature mappings in order to increase the attacker's uncertainty about the target PD. Using 3 existing publicly available datasets plus a fourth that we have created and will release upon the publication of this paper, we show that POC is more robust to these attacks than past competing work, while preserving predictive performance when no adversarial attacks are present. Moreover, POC is robust to attacks on 13 different classifiers, not just one. These results are shown to be statistically significant at the p < 0.001 level.
translated by 谷歌翻译
Specular microscopy assessment of the human corneal endothelium (CE) in Fuchs' dystrophy is challenging due to the presence of dark image regions called guttae. This paper proposes a UNet-based segmentation approach that requires minimal post-processing and achieves reliable CE morphometric assessment and guttae identification across all degrees of Fuchs' dystrophy. We cast the segmentation problem as a regression task of the cell and gutta signed distance maps instead of a pixel-level classification task as typically done with UNets. Compared to the conventional UNet classification approach, the distance-map regression approach converges faster in clinically relevant parameters. It also produces morphometric parameters that agree with the manually-segmented ground-truth data, namely the average cell density difference of -41.9 cells/mm2 (95% confidence interval (CI) [-306.2, 222.5]) and the average difference of mean cell area of 14.8 um2 (95% CI [-41.9, 71.5]). These results suggest a promising alternative for CE assessment.
translated by 谷歌翻译
我们提出了一个分散的“Local2Global”的图形表示学习方法,即可以先用来缩放任何嵌入技术。我们的Local2Global方法首先将输入图分成重叠的子图(或“修补程序”)并独立地培训每个修补程序的本地表示。在第二步中,我们通过估计使用来自贴片重叠的信息的刚性动作的一组刚性运动来将本地表示将本地表示与全局一致的表示。 Local2Global相对于现有工作的关键区别特征是,在分布式训练期间无需经常昂贵的参数同步训练曲线的培训。这允许Local2Global缩放到大规模的工业应用,其中输入图甚至可能均不适合存储器,并且可以以分布式方式存储。我们在不同大小的数据集上应用Local2Global,并表明我们的方法在边缘重建和半监督分类上的规模和准确性之间实现了良好的权衡。我们还考虑异常检测的下游任务,并展示如何使用Local2Global在网络安全网络中突出显示异常。
translated by 谷歌翻译
数据增强是自然语言处理(NLP)模型的鲁棒性评估的重要组成部分,以及增强他们培训的数据的多样性。在本文中,我们呈现NL-Cogmenter,这是一种新的参与式Python的自然语言增强框架,它支持创建两个转换(对数据的修改)和过滤器(根据特定功能的数据拆分)。我们描述了框架和初始的117个变换和23个过滤器,用于各种自然语言任务。我们通过使用其几个转换来分析流行自然语言模型的鲁棒性来证明NL-Upmenter的功效。基础架构,Datacards和稳健性分析结果在NL-Augmenter存储库上公开可用(\ url {https://github.com/gem-benchmark/nl-augmenter})。
translated by 谷歌翻译
针对任务导向的对话系统的强大状态跟踪目前仍然限于一些流行语言。本文显示,给定以一种语言设置的大规模对话数据,我们可以使用机器翻译自动为其他语言生成有效的语义解析器。我们提出了对话数据集的自动翻译,并进行对齐,以确保插槽值的忠实翻译,并消除以前的基准中使用的昂贵人类监督。我们还提出了一种新的上下文语义解析模型,它编码正式的插槽和值,只有最后一个代理和用户话语。我们表明,简洁的表示降低了翻译误差的复合效果,而不会损害实践中的准确性。我们评估我们对几个对话状态跟踪基准的方法。在Risawoz,Crosswoz,Crosswoz-Zh和Multiwoz-Zh Datasets,我们将最先进的技术提高11%,17%,20%和0.3%,以共同的目标准确度。我们为所有三个数据集提供了全面的错误分析,显示错误注释可以模糊模型质量的判断。最后,我们使用推荐方法创建了Risawoz英语和德语数据集。在这些数据集中,准确性在原始的11%以内,表示可能的高精度多语言对话数据集,而无需依赖昂贵的人类注释。
translated by 谷歌翻译
Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for reactive motion planning. Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning from demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task.
translated by 谷歌翻译
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computationally efficiency goals, and it has been recently used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2- armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential that designs using a GI approach to allocate participants have to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
translated by 谷歌翻译